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The method of surrogates is widely used in the field of nonlinear data 
analysis for testing for weak nonlinearities. The two most commonly used al- 
gorithms for generating surrogates are the amplitude adjusted Fourier trans- 
form (AAFT) and the iterated amplitude adjusted Fourier transfom (lAAFT) 
algorithm. Both the AAFT and lAAFT algorithm conserve the amplitude dis- 
tribution in real space and reproduce the power spectrum (PS) of the original 
data set very accurately. 

The basic assumption in both algorithms is that higher-order correlations can 
be wiped out using a Fourier phase randomization procedure. In both cases, 
however, the randomness of the Fourier phases is only imposed before the (first) 
Fourier back tranformation. Until now, it has not been studied how the subse- 
quent remapping and iteration steps may affect the randomness of the phases. 
Using the Lorcnz system as an example, we show that both algorithms may 
create surrogate realizations containing Fourier phase correlations. We present 
two new iterative surrogate data generating methods being able to control the 
randomization of Fourier phases at every iteration step. The resulting surro- 
gate realizations which are truly linear by construction display all properties 
needed for surrogate data. 

Keywords: Surrogates, AAFT, lAAFT, IPAFT, non-linear prediction error, 
lorenz system 

1. Introduction 

The detection of non-linear behavior in data sets is a general problem which 
arises in diverse disciphnes.^^^ Tests for non-Unearity which compare a data 
set to the null hypothesis of a Gaussian linear process are relevant either to 
constrain models or support theories of a particular system. In this context, 
the surrogate data tests are model independent tests for non-linearity^°"^^ 
which can be applied with any non-linear statistics that characterizes a data 
set. Ideal surrogate data should possess not only the same power spectrum 
and amplitude distribution in real space as the original data set but also be 
free of higher-order correlations. The Amplitude Adjusted Fourier Trans- 



December 12, 2008 14:21 



WSPC - Proceedings Trim Size: 9in x 6in surro'rand'phases 



2 



form (AAFT) and the Iterative Amplitude Adjusted Fourier Transform 
(lAAFT) algorithms^"'^^'^^ are the most popular algorithms to generate an 
ensemble of surrogate realizations. AAFT and lAAFT algorithms conserve 
the amplitude distribution in real space and reproduce the power spectrum 
(PS) of the original data set quite accurately. The basic assumption in 
both algorithms is that higher-order correlations can be wiped out using 
a Fourier phase randomization procedure since the PS is invariant under 
Fourier phase permutations. Previous studies"'^'''^*'"'^^ focused on the abil- 
ity of surrogate generating algorithms to accurately reproduce the PS of 
the original data. However, it has been shown that for an ARMA process 
an ensemble of surrogates generated by the AAFT algorithm displays an 
anomalous small variability of the PS when compared to an ensemble of 
identical ARMA processes. This poses the question of the definition of 
the null hypothesis for the surrogate test which will certainly depend upon 
the system under study. More important than exactly reproducing the PS 
of the original data is to create surrogate data free of higher-order correla- 
tions which by definition are linked to the properties of the Fourier phases. 
However, how phase correlations are related to higher-order correlations is 
an important yet unresolved open problem. AAFT and lAAFT algorithms 
impose constraints on the Fourier amplitudes and the amplitude distribu- 
tion in real space without controlling Fourier phases. As shown in the next 
section, these constraints may generate Fourier phase correlations. In the 
rest of this contribution we propose and discuss novel surrogate generating 
algorithms in which the randomness of the phases is explicitly controlled. 



2. Pheise correlations in surrogates 

In this section we introduce the model systems which we use in our study. 

Furthermore we demonstrate, how phase correlations, thus non-linearities, 
can be induced in surrogate data sets. 

2.1. Model systems 

As a linear system we consider the first order ARMA process described by 



Xn+l = 0.7Xn + ^, 



(1) 
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with ^„ being Gaussian noise. 

Furthermore, we use the z-component of the Lorenz system^^ 

x = a{y- x) (2) 
y = x{b-z)-y (3) 
z = xy — cz (4) 

in the chaotic regime , i.e. a = 10, & = 28 and c = 8/3, as an example 
for a time series derived from a deterministic chaotic system, which ex- 
plicitly contains non-linear correlations. The original data were sampled 
at t' = 0.001 and a rank ordered remapping onto a Gaussian distribution 
was performed before applying the surrogate algorithms. The original time 
series resulting from the rank ordered remapping onto a Gaussian distri- 
bution is called zl throughout the text. The length of the times series is 
M = 215. 




Fig. 1. Phase maps for the hnear ARMA process and its surrogates (for details see 
text). 
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Fig. 2. a) Fourier phase map of an lAAFT surrogate realization of for A = 1. b) 
same as (a) but using an AAFT surrogate realization. 



2.2. Phase maps 

A Fourier phase map is a two-dimensional set of points G — {(fiki (t^k+A} 
where (j>k is the phase of the k^^ mode of the Fourier transform and A a 
phase delay. All possible modes k up the Nyquist frequency are considered. 
Using this representation, phase correlations can show up in the maps as 
high/low density regions. 



2.3. Inducing phase correlations 

First, we consider the ARMA process described above. The amplitude dis- 
tribution in real space for the ARMA process is Gaussian. Figure [T] upper 
left panel shows a Fourier phase map for a relization of this ARMA pro- 
cess. As expected, this phase map shows no signatures of phase correlations. 
Suppose that we replace two data point by two spikes thus the distribution 
is almost Gaussian. Figure [T] upper right panel shows the Fourier phase 
map for the spiky time series. It is clear that the presence of spikes has 
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induced Fourier phase correlations. Now, we generate an lAAFT surrogate 
realization of the spiky times series. Figure [l] lower left panel shows the 
Fourier phase map for this lAAFT surrogate where once again the Fourier 
phases are strongly coupled. The first guess is that this strong coupling 
in the phases is only due to the spikes still present in the lAAFT realiza- 
tion and that this will vanish after removing the spikes. However, as shown 
in Fig. [l] lower right panel, phases correlations still remain after removing 
the spikes. Then, this residual phase correlations have been induced during 
the iterative process leading to an lAAFT surrogate which displays signa- 
tures of non-linearity. We conclude that Gaussian remapping must always 
be performed since this step warrants that we obtain random uniformly 
distributed Fourier phases at the first iteration step of lAAFT. 
Second, we calulate AAFT and lAAFT surrogate realisations of the (Gaus- 
sian remapped) z-component of the Lorenz system and represent the 
phase information using phase maps (Fig. [2]) . It becomes immediately ob- 
vious that both surrogate generating algorithms can induce phase correla- 
tions. 

3. The IPAFT algorithm 

We introduce two alternative iterative methods to generate surrogate real- 
izations where the randomization of higher-order correlations is controlled 
at every iteration step by imposing uncorrclatcd uniformly distributed ran- 
dom Fourier phases. For data having a Gaussian amplitude distribution in 
real space, the resulting surrogate realization will have a similar distribution 
in real space, a well-reproduced auto-correlation function, and uncorrelated 
uniformly distributed Fourier phases. 

Algorithm A consists of the following steps. Consider a time series 
Um n — l,...,M. (i) Perform a rank-ordered remapping of the original 
data onto a Gaussian distribution Xn = G'(y„). Remapping a time series 
onto a Gaussian distribution conserves dynamic non-linearities^^ and re- 
duces the amount of whitening of the PS induced by the iterations. Eval- 
uate the Fourier transform of Xn- In addition, generate a random shuffling 
realization of a;„ and calculate its Fourier transform. Since the amplitude 
distribution in real space is Gaussian, we obtain uncorrelated uniformly 
distributed Fourier phases in the interval [— tt, tt]. If we avoid the remap- 
ping step onto a Gaussian distribution then the shuffling step cannot ensure 
uncorrelated uniformly distributed Fourier phases as shown above. "^'^ (ii) 
Combine the Fourier amplitudes of a;„ with the random Fourier phases and 
perform an inverse Fourier transformation. Let us call the resulting time 
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series a;^. (iii) Sort (z„ = 5*2;/ {xn)) and evaluate the Fourier trans- 

form of Zn- Steps (i), (ii), and (iii) describe the AAFT algorithm, i.e. Zn is 
an AAFT surrogate realization. Instead of replacing the Fourier amplitudes 
of Zn by the original Fourier amplitudes, as is the case for lAAFT, we re- 
place the Fourier phases by the random phases obtained in step (i). Using 
the new Fourier amplitudes and the random phases we repeat steps (ii) and 
(iii). Iterations stop after convergence of the probability distribution in real 
space. In the last iteration, step (iii) is avoided thus we obtain a surrogate 
realization containing uncorrelated uniformly distributed Fourier phases. 
Step (iii) may actually reintroduce uncontrolled Fourier phase correlations 
as shown in Fig. [2] (b). Imposing random Fourier phases at every iteration 
step will prevent the algorithm to develop phase correlations during evo- 
lution and so it maintains the randomization of higher-order correlations. 
However, the original PS is imposed only at the first iteration [s] and evolves 
freely through subsequent iterations. This freedom given to Fourier ampli- 
tudes introduces, however, only small deviations to the PS after the first 
iteration step, which is the one that brings the initial flat PS to the desired 
one. Convergence was verified in several tests even in higher dimensions. It 
is expected since a Gaussian linear process is characterized by both a Gaus- 
sian amplitude distribution and uncorrelated uniformly distributed Fourier 
phases. 

Using the ARMA process, we quantified the spectral variability of an en- 
semble of phase-controlled surrogates via a = SfeLi ^^''^p^2'^ i where 
Pk is the power of the fc-mode of a surrogate realization and < Pk > is the 
mean power. The spectral variability is compared with the actual vari- 
ability of an ensemble of ARMA processes using < a >s and as(a). Our 
results (see Table below) show that in spite of allowing for freely evolving 
Fourier amplitudes, the spectral variability of phase-controlled surrogates 
is almost two orders of magnitude smaller than that of the ensemble of 
ARMA processes. Algorithm B tackles this problem by allowing a variance 
of Pk over the set of surrogates given by cr^ — Pk-^^ It is identical to Al- 
gorithm A except for introducing the variance in step (ii) only at the first 
iteration. 



Ensemble 


< a>s 


as{a) 


ARMA process 
Phase-controlled A 
Phase-controlled B 


1.04 
0.016 
0.67 


0.03 
0.006 
0.02 
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Ensemble 




as(A/(-)) 


AAFT 


0.0054 


0.0028 


lAAFT 


s.gxiO"'^ 


4.4x10-^^ 


Phase-controlled A 


1.7x10-8 


1x10-3 


Phase-controlled B 


1.7x10-8 


1.1x10-3 



4. Results 

4.1. Convergence 

We consider the convergence of the amplitude distribution in real space 
as the stopping criterion for the algorithms. Convergence is assessed via 
the relative deviation of the amplitude distribution defined as A/*^*^ = 
X]fcl=o^('"fc^ — Tk)^ / J2h=Q^ ''fc' where r^'"* is the rank-ordered time series re- 
sulting after step (ii) at iteration i, and is the original rank-ordered 
time series. By skipping the last rank-ordered remapping step in AAFT 
and lAAFT algorithms, it is possible to compare the performance of all al- 
gorithms to reproduce the amplitude distribution in real space. The Table 
above shows the results for AI^°°\ i.e. the value of A/'*-' after convergence. 
This table indicates that phase-controlled surrogates reproduce the ampli- 
tude distribution in real space one order (six orders) of magnitude better 
than lAAFT (AAFT), respectively. 

Figure |3]shows A^. versus k and C(r) versus r for the z component of the 
Lorenz system for algorithms A and B after convergence. These results 
indicate that both algorithms lead to well-behaved surrogate realizations. A 
comparison of Figs, [s] (left) and [s] (right) shows that introducing a variance 
in the PS leads to a widening of the la error bars. 



4.2. Applications to the Lorenz- System 

We performed a comparative study of the four different surrogate gener- 
ating algorithms, namely AAFT, lAAFT, phase-controlled A, and phase- 
controlled B surrogates. To this purpose, we applied the non-linear predic- 
tion error (NLPE) method^^'^^ to test for non-linearity. To calculate NLPE, 
the time series is embedded in a c?-dimensional space using the method of 
delay coordinates: Xn = {xn~{d-i)TT ^n~(d-2)Tj • ■ • ? 2;„), where r is the delay 
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Fig. 3. (left) Smoothed Fourier amplitude Aj. versus the wave number k for zi^. The 
inset shows the autocorrelation function C(r) versus the time lag t for zi^. The shaded 
region shows la error bars obtained using 50 phase-controlled surrogate realizations 
generated by algorithm A. (right) Same as the left figure but using 50 phase-controlled 
surrogate realizations generated by algorithm B. 



time. Then, we define the NLPE as 



^id,r,T,N)= J [^n+T-F(f„)p) (5) 

n— (d— l)r 

where is a locally constant predictor, M is the length of the time series, 
and T is the lead time. The predictor F is calculated by averaging over 
future values of the N {N = d+1) nearest neighbors in the delay coordinate 
representation. We have studied the behavior of V' as a function of the lead 
time T. We found that for T > b ip remains rather constant, thus a value 
of T = 10 was used for this test. Using -ip, we evaluated the cr-normalized 
deviation S and the relative deviation R (see caption of Fig.|4|. In all cases, 
T was chosen in order to satisfy the criterion of zero autocorrelation. 
Figure |4|a) shows 4' versus the embedding dimension d for lAAFT, AAFT 
and phase-controlled A surrogates. NLPE distinguishes between the three 
groups of surrogates and indicates that both AAFT and lAAFT surrogate 
realizations are more predictable than phase-controlled A surrogates, i.e. 

^Q— < ip >s is larger for phase-controlled A surrogate realizations 
for all embedding dimensions. It is remarkable that the three groups do 
not overlap. Figure |4]^a) also shows that the largest variability is observed 
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Fig. 4. a) i/i versus the embedding dimension for 2^ (black curve), an ensemble of 
lAAFT (red) , AAFT (blue), and phase-controlled (A) (yellow) surrogate realizations, 
b) The relative deviation R^{d = 5) versus SNR for the three classes. R^{d = 5) = 
\ipO— < ip >s 1/ < V" >S where ipo is the NLPE value for the original data and < ^ >s 
is the mean value of as derived from surrogate realizations, for embedding dimension 
d = 5. The inset shows the cr-normalized deviation versus SNR for the three classes. 
S(d) = \ipo{d)— < "0 >5 \/crg{il>), where ag{^) is the standard deviation of i/j as derived 
from surrogate realizations, c) and d) The same as a) and b) but using phase-controlled 
(B) surrogate realizations, respectively. 



for the set of phase-controlled A surrogate realizations and it decreases for 

AAFT and lAAFT surrogate realizations, respectively. 

The cr-normalized deviation, which is a function of the surrogate ensemble 



variability and 



V'o- < ip >s 



, takes the values 5(5) = 5.1, 6.9, and 5.8 



December 12, 2008 14:21 



WSPC - Proceedings Trim Size: 9in x 6in surro'rand'phases 



10 



for lAAFT, AAFT, and phase controlled A surrogates, respectively. 

We also analyzed the behavior of all surrogate classes in the presence 
of noise. We took and superimposed additive Gaussian white noise 
with varying standard deviation cr„. The signal to noise ratio is defined 
as SNR = (Ts/tTn where Us is the standard deviation of the original time 
series. In the presence of noise, the behavior of ip versus d resembles that of 
Fig. |4|a) but differences among the three surrogate classes are now smaller. 
The inset plot of Fig.gb) shows 5(5) versus SNR for lAAFT, AAFT and 
phase-controlled A surrogates. For SNR < 0.5, the three groups lead to 
non-significant results as expected. For SNR > 0.5, S{5) grows rapidly and 
displays exceptional high values for all surrogate classes for relatively low 
SNR values. In fact, maxima values of S{5) ~ 20, 25, and 30 are observed at 
SNR 1, 1.5, and 2 for phase-controlled A, AAFT, and I A AFT surrogates, 
respectively. All surrogate classes display a counterintuitive decreasing be- 
havior when further increasing SNR. Even for the highest SNR value here 
studied {SNR = 6), lAAFT and AAFT display S{5) values higher than 
the value obtained in the absence of noise. However, for phase-controlled 
A surrogates this feature is only observed for SNR < 3.5. Two different 
effects are responsible for this trend, namely the increase of \ipo— < >s \ 
and the decrease of crs(i/') with decreasing SNR. However, this trend is 
more pronounced for lAAFT and AAFT surrogates than due to the smaller 
variability observed for the standard surrogate data which already becomes 
clear in Fig. [4] (a), crsii^) is approximately one order of magnitude (5 times) 
smaller for lAAFT (AAFT) surrogates than for phase-controlled A surro- 
gates. For SNR ^ 0.5, |V-'o~ < >S \ sharply drops to zero thus yield- 
ing the expected non-significant results for all surrogate classes. Fig. W[h) 




shows the relative deviation R^{5) versus SNR. Note that the order of the 
curves has been mirrored with respect to their order in the inset plot. Since 



shows the expected trend for increasing SNR but it also takes larger val- 
ues when evaluated in the absence of noise (i?0(5) — 0.33, 0.51, and 0.70 for 
lAAFT, AAFT, and phase-controlled A surrogates respectively). Thus, it 
provides a suitable relative measure of significance for different SNR. This 
behavior is also observed for other embedding dimensions. 

Figure Qc) shows ip versus d for lAAFT, AAFT and phase-controlled B 
surrogates. In order to allow for a fair comparison among surrogate classes, 
we modified both the lAAFT and AAFT algorithms by introducing a vari- 
ability in the PS as done in algorithm B. Phase-controlled B surrogates 
are still distinguished as a single group by the NLPE and display the 




S 



R4,<->P>s 



then as{ip) is responsible for this reordering. i?^(5) not only 
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largest variability. In the absence of additive noise phase-controlled B sur- 
rogates lead to the most significant result S{5) = 5.9 followed by AAFT 
{S{5) = 3.6), while lAAFT is non-significant {S{5) = 2.1). The inset plot 
of Fig. shows that in the presence of noise the cr-normalized deviation 
has dropped well below the 3ct threshold for all surrogate classes. However, 
it displays the expected increasing trend with SNR values thus providing 
an equivalent description as the relative deviation R^(J5). 



T 




10 ZO 30 40 50 

No. of AAFT-Surroga.te 



I » I I r 1 I » I I . I 




^0- Of IMFT-SiiTTQgal 



Fig. 5. "0(5) values for 50 AAFT (above) and 50 lAAFT (below) surrogate realizations 
of zi^ for SNR = 4 (red dots) and for 20 phase-controlled (A) surrogates generated from 
each of them (black dots). 
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To investigate the differences between surrogate classes, we considered 
all the 50 AAFT and 50 lAAFT surrogate realizations of zl for SNR = 4 
and generated 20 phase-controlled A surrogates for each of them. Red dots 
in Figure |5]show the values of ip{5) for AAFT and lAAFT surrogate real- 
izations and black dots show the values of for the respective phase- 
controlled A surrogates. We observe that for both AAFT and lAAFT surro- 
gates, phase-controlled A surrogates lead to larger ip{5) values in all cases. 
It is worth noting that in all 2 • 50 = 100 tests, none of the phase-controlled 
A surrogate realizations showed lower (more predictable) values of il^i^) 
than the respective AAFT (lAAFT) surrogate. These differences in iIj{5) 
between standard surrogates and phase-controlled A surrogates generated 
from them become even larger for higher SNR values. Since algorithm A 
accurately reproduces the PS of the original data and phase correlations 
in standard surrogates become stronger for higher values of SNR, we at- 
tribute the differences unveiled by the NLPE to the presence of Fourier 
phase correlations. 

5. Conclusions 

We demostrated that the AAFT and lAAFT algorithm may generate sur- 
rogate realizations containing Fourier phase correlations. Motivated by this 
finding, we presented two new surrogate generating algorithm being able to 
control the randomization of Fourier phases at every iteration step. Using 
the NLPE method wc found clear differences among the surrogate classes 
that we attribute to the randomization level of the Fourier phases of the 
surrogate realizations. These differences appear due to the constraints im- 
posed on the surrogate generating procedure. As shown above, freely evolv- 
ing Fourier amplitudes still lead to ensembles of surrogates having a small 
variability of the PS. Algorithm B deal with this problem by allowing for 
a variability in the PS. The choice of the algorithm will depend upon the 
definition of the null hypothesis for every specific problem. 
In view of our results, phase-controlled surrogate realizations are data sets 
which best reproduce the amplitude distribution in real space and the PS 
of the original data while explicitly fulfilling the constraint that only linear 
correlations are contained. 
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